Skip to content

fix: preserve Claude stdio MCP env#1224

Merged
danielmeppiel merged 7 commits into
microsoft:mainfrom
imk1t:fix/claude-stdio-env-1222
May 14, 2026
Merged

fix: preserve Claude stdio MCP env#1224
danielmeppiel merged 7 commits into
microsoft:mainfrom
imk1t:fix/claude-stdio-env-1222

Conversation

@imk1t
Copy link
Copy Markdown
Contributor

@imk1t imk1t commented May 9, 2026

Description

Preserve env values for self-defined stdio MCP servers when generating Claude Code .mcp.json.

Root cause: self-defined stdio MCP entries pass env into the shared adapter formatting path as a plain dictionary. The Copilot runtime-substitution path already handled that shape, but the legacy literal-resolution path used by Claude Code only handled registry-style environment variable lists, so the dictionary resolved to an empty object.

This keeps the fix scoped to that literal-resolution path and adds a Claude Code regression test proving that .mcp.json preserves the declared environment variable.

Fixes #1222

Type of change

  • Bug fix
  • New feature
  • Documentation
  • Maintenance / refactor

Testing

  • Tested locally
  • All existing tests pass
  • Added tests for new functionality (if applicable)

Validation run:

uv run --extra dev pytest tests/unit/test_claude_mcp.py tests/unit/test_build_mcp_entry.py tests/test_codex_empty_string_and_defaults.py -q
uv run --project <repo> --extra dev apm install --target claude --no-policy
uv run --extra dev pytest tests/unit tests/test_console.py -x
uv run --extra dev ruff check src/ tests/
uv run --extra dev ruff format --check src/ tests/

The CLI smoke test used a generic self-defined stdio MCP entry with DEMO_ENV=demo-value and verified that the generated .mcp.json contains:

"env": {
  "DEMO_ENV": "demo-value"
}

@imk1t imk1t marked this pull request as ready for review May 9, 2026 15:07
@imk1t imk1t requested a review from danielmeppiel as a code owner May 9, 2026 15:07
Copilot AI review requested due to automatic review settings May 9, 2026 15:07
@imk1t
Copy link
Copy Markdown
Contributor Author

imk1t commented May 9, 2026

@microsoft-github-policy-service agree

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a Claude Code-specific regression where self-defined stdio MCP servers lost their declared env when generating .mcp.json, by teaching the legacy (literal-resolution) env resolver to handle the self-defined env dict shape.

Changes:

  • Extend _resolve_environment_variables to resolve env provided as a plain {NAME: value} dict when runtime env substitution is disabled (Claude Code path).
  • Add a unit regression test asserting that .mcp.json preserves env for a self-defined stdio server.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
tests/unit/test_claude_mcp.py Adds regression coverage for preserving env in Claude Code .mcp.json for self-defined stdio servers.
src/apm_cli/adapters/client/copilot.py Updates legacy env-variable resolution to support dict-shaped env inputs used by self-defined stdio MCP entries.

Comment on lines +791 to +802
if isinstance(env_vars, dict):
resolved = {}
for name, value in env_vars.items():
if not name:
continue
if isinstance(value, str):
resolved[name] = self._resolve_env_variable(
name, value, env_overrides=env_overrides
)
elif value is not None:
resolved[name] = value
return resolved
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@github-actions
Copy link
Copy Markdown

APM Review Panel: needs_rework

fix: preserve Claude stdio MCP env + normalize env values to strings -- fixes silent env loss for integer/float configs but ships a boolean coercion bug (False->"False") that needs a one-liner before merge

cc @danielmeppiel -- a fresh advisory pass is ready for your review.

The core fix is correct and should ship. This PR closes a real, high-frequency user pain: any apm.yml that sets numeric env values (PORT: 3000, TIMEOUT: 30) silently produced a broken Claude stdio MCP config. The new legacy-mode dict branch correctly routes non-string values into a str() coercion path and adds a regression test for the integer case. The panel converges on shipping this direction -- the fix is the right call and the code placement is sound.

However, the PR introduces a secondary correctness bug that the test suite does not catch: str(False) == "False" (capital F). DevX UX flagged this blocking, supply-chain confirmed it, and the test-coverage expert surfaced it as a missing-test gap. Shell tooling and 12-factor apps universally expect lowercase "false"/"true". This is not hypothetical -- any user with DEBUG: false or ENABLED: true in apm.yml gets a semantically wrong string baked into ~/.claude.json. The fix is a one-liner (str(value).lower() when isinstance(value, bool)) and must land in the same commit, not a follow-up. The panel also independently identifies the translate-mode dict branch (line 748-749) as inconsistent: it does not apply the same str() coercion, meaning the same input shape produces different output types depending on which code path runs. This asymmetry is a latent bug. The CHANGELOG omission is flagged by three personas (doc-writer, devx-ux, oss-growth) -- per project convention a user-visible bug fix requires a changelog entry before merge.

The print() on line 819 flagged blocking by cli-logging is pre-existing and not introduced by this PR. It should not gate merge; track it as a standalone cleanup issue. The supply-chain concern about str() on YAML list/dict values producing Python repr is real but low-urgency -- an explicit type whitelist is a reasonable follow-up.

Dissent. cli-logging-expert rated the print() on line 819 as blocking. This finding is correct on the merits -- raw print() violates the APM output architecture -- but the line is pre-existing and not introduced by this PR. Blocking a focused bug-fix PR on a pre-existing unrelated violation is disproportionate; the CEO sides with not gating merge on it.

Aligned with: Pragmatic as npm: numeric YAML env values (PORT, TIMEOUT, DEBUG) are the natural way humans write config -- the fix meets users where they are rather than demanding string quoting as a workaround. Portable by manifest: env normalization must be deterministic across all code paths; the translate-mode branch inconsistency (line 748-749) breaks this promise for the same apm.yml input. Secure by default: YAML list/dict values producing Python repr in a config-bake path is a reliability risk worth a follow-up type whitelist. DevX: boolean env values are a trap (str(False) == "False"); lowercase coercion is the correct UX contract and must ship with this fix.

Growth signal. Community contributor Koichi Takahashi fixed a silent failure that would hit any user configuring stdio MCP servers with numeric YAML env values -- an extremely common pattern (PORT, DEBUG, TIMEOUT). The next release post should call this out explicitly as a community-contributed fix. Consider adding a working stdio-with-numeric-env example to the docs quickstart to prevent future confusion and convert this fix into a discovery surface for new users.

Panel summary

Persona B R N Takeaway
Python Architect 0 1 1 New dict-env branch is correctly placed and scoped; one type-consistency gap between translate-mode and legacy-mode dict handling; docstring needs updating.
CLI Logging Expert 1 2 1 Silent coercion of non-string env values needs a verbose-mode diagnostic; bare print() on line 819 violates STATUS_SYMBOLS convention; translate-mode dict path inconsistently leaves non-strings as-is.
DevX UX Expert 1 2 1 str() coercion silently capitalises YAML booleans (false->False); None silently dropped; CHANGELOG entry missing; test coverage thin on edge cases.
Supply Chain Security Expert 0 1 2 No blocking supply-chain issues; new dict-env path is consistent with legacy bake-to-disk behavior; one recommended note on str() of complex YAML values.
OSS Growth Hacker 0 1 1 Good external-contributor fix for a high-frequency silent failure (PORT as int breaks Claude); CHANGELOG entry is absent -- blocks users from knowing the fix shipped.
Doc Writer 0 1 1 CHANGELOG [Unreleased] is missing the Claude stdio env fix entry; existing docs describe intended behavior correctly so no doc drift found.
Test Coverage Expert 0 1 1 Int-to-string coercion is tested (PORT:3000->"3000"); bool coercion (False->"False") and None-value silent-drop have no regression trap; unit tier is correct for this pure function.

B = blocking-severity findings, R = recommended, N = nits.
Counts are signal strength, not gates. The maintainer ships.

Top 5 follow-ups

  1. [DevX UX Expert] (blocking-severity) Boolean env values coerce to 'False'/'True' (capital) instead of 'false'/'true' -- fix in same PR before merge -- str(False)=="False" is a correctness bug introduced by this PR. Missing test confirmed by test-coverage-expert (evidence: outcome=missing). One-liner fix: str(value).lower() when isinstance(value, bool). Must land before merge.
  2. [Python Architect] Translate-mode dict branch (line 748-749) does not apply str() coercion -- same input, different output type depending on code path -- independently flagged by both python-architect and cli-logging-expert. Breaks the dict[str,str] return-type contract. One-line fix mirrors what this PR already does in the legacy branch.
  3. [OSS Growth Hacker] Add CHANGELOG [Unreleased] ### Fixed entry for the stdio env normalization fix before merge -- three personas converged on this (doc-writer, devx-ux, oss-growth). Users upgrading cannot discover the fix without it. Per project convention every user-visible bug fix requires a changelog bullet.
  4. [Supply Chain Security Expert] Add explicit type whitelist (int/float/bool) for str() coercion; warn-and-skip YAML list/dict values -- str(["a","b"]) produces Python repr baked into ~/.claude.json. A whitelist prevents confusing runtime failures and makes the contract explicit.
  5. [Test Coverage Expert] Add None-value-in-dict regression test to pin the intentional silent-drop behavior -- the None-drop is currently undocumented in the test suite. A test asserting "ABSENT" not in env makes the contract explicit and prevents a future change from silently altering behavior.

Architecture

classDiagram
    direction LR

    class MCPClientAdapter {
        <<AbstractBase>>
        +configure_mcp_server(ref) bool
    }

    class CopilotClientAdapter {
        <<ConcreteAdapter>>
        +_supports_runtime_env_substitution bool
        +_last_env_placeholder_keys set
        +_last_legacy_angle_vars set
        +_resolve_environment_variables(env_vars, env_overrides) dict
        +_resolve_env_variable(name, value, env_overrides) str
        +configure_mcp_server(ref) bool
    }

    class EnvVars_DictInput {
        <<ValueObject>>
        +name str
        +value str or int or float or None
    }

    class EnvVars_ListInput {
        <<ValueObject>>
        +name str
        +description str
        +required bool
    }

    class ResolvedEnv {
        <<ValueObject>>
        +name str
        +value str
    }

    MCPClientAdapter <|-- CopilotClientAdapter : extends
    CopilotClientAdapter ..> EnvVars_DictInput : accepts
    CopilotClientAdapter ..> EnvVars_ListInput : accepts
    CopilotClientAdapter ..> ResolvedEnv : returns

    class CopilotClientAdapter:::touched
    classDef touched fill:#fff3b0,stroke:#d47600
Loading
flowchart TD
    A["_resolve_environment_variables(env_vars, env_overrides)"] --> B{"isinstance(env_vars, dict)?"}

    B -- "yes" --> C{"_supports_runtime_env_substitution?"}
    B -- "no" --> F{"_supports_runtime_env_substitution?"}

    C -- "yes (translate mode)" --> D["Branch 1: translate placeholders\nline 742-766"]
    D --> D1{"isinstance(raw_value, str)?"}
    D1 -- "no" --> D2["[BUG] translated[name] = raw_value as-is\nnon-string, may break JSON contract\nline 748-750"]
    D1 -- "yes + placeholder" --> D3["_translate_env_placeholder(raw_value)"]
    D1 -- "yes, literal default" --> D4["raw_value kept literal"]
    D1 -- "yes, other literal" --> D5["dollar-brace NAME (secret guard)"]

    C -- "no (legacy dict, NEW branch)" --> E["Branch 3 [NEW]: coerce dict env\nline 791-802"]
    E --> E1{"isinstance(value, str)?"}
    E1 -- "yes" --> E2["_resolve_env_variable(name, value, env_overrides)"]
    E1 -- "no and not None" --> E3["[FIX] str(value)\n[FOLLOW-UP: bool needs .lower()]"]
    E1 -- "None" --> E4["skip key"]

    F -- "yes" --> G["Branch 2: emit dollar-brace placeholders\nlist input, line 768-789"]
    F -- "no" --> H["Branch 4: legacy list + Rich prompts\nline 804-868"]
Loading

Recommendation

The core fix is correct and should ship. Do not revert or redesign. But the PR needs three targeted changes before merge: (1) fix the boolean coercion bug (str(value).lower() when isinstance(value, bool)) and add a parametrized test covering bool cases -- this is a correctness regression introduced by the PR itself, confirmed missing by the test-coverage expert; (2) apply the same str() coercion to the translate-mode dict branch at line 748-749 to restore type-contract consistency across both paths; (3) add a CHANGELOG [Unreleased] ### Fixed bullet with a user-symptom-first description. All three changes are small (combined under 15 lines) and should be done in the same PR before merge. The print() on line 819 is pre-existing and must not gate this fix; file it as a separate issue.


Full per-persona findings

Python Architect

  • [recommended] Translate-mode dict branch passes non-string values through as raw Python objects; new legacy branch coerces to str -- inconsistent return-type contract at src/apm_cli/adapters/client/copilot.py:748
    At line 748-750, the translate-mode dict branch does translated[name] = raw_value; continue for any non-string value. The new legacy dict branch (line 800-801) does str(value) for the same input shape. Both branches should return dict[str, str] but the translate branch can return dict[str, int|float|...].
    Suggested: Change translated[name] = raw_value to translated[name] = str(raw_value) in the translate-mode dict branch (line 749), mirroring the fix applied here.

  • [nit] Docstring Args section still says env_vars (list) after the fix, but the function now accepts both list and dict at src/apm_cli/adapters/client/copilot.py:720
    The Args block documents env_vars as (list) only. This PR adds a third dict-specific branch, making the dual-type contract more prominent.
    Suggested: Change env_vars (list) to env_vars (list | dict) and add a second bullet explaining the dict shape.

CLI Logging Expert

  • [blocking] print() on line 819 bypasses _rich_* / STATUS_SYMBOLS convention and leaks test-infra noise to stdout at src/apm_cli/adapters/client/copilot.py:819
    Raw print() calls violate the APM output architecture. This specific line also uses an f-string with no interpolation (noqa:F541 comment confirms it). Note: this is pre-existing, not introduced by this PR.
    Suggested: Replace with _rich_info('[i] APM_E2E_TESTS detected, skipping environment variable prompts') or gate behind verbose flag.

  • [recommended] Silent str() coercion of non-string env values emits no diagnostic -- user cannot tell PORT:3000 became string '3000' at src/apm_cli/adapters/client/copilot.py:801
    The new dict+non-translate path silently coerces any non-string value. Under --verbose this should emit at minimum a dim info line so power users can audit what was baked into config.
    Suggested: After resolved[name] = str(value), add a verbose-guarded log: [i] ENV: NAME coerced type->str.

  • [recommended] Translate-mode dict path (line 749) preserves non-string values as raw Python objects -- inconsistent with the new non-translate fix at src/apm_cli/adapters/client/copilot.py:749
    PORT:3000 in translate mode is stored as int, while the new non-translate path coerces to str. Two branches have opposite behaviour for the same input shape.
    Suggested: Replace translated[name] = raw_value with translated[name] = str(raw_value).

  • [nit] Docstring for _resolve_environment_variables still says env_vars is list at src/apm_cli/adapters/client/copilot.py:720
    Suggested: Update to env_vars (list | dict).

DevX UX Expert

  • [blocking] YAML boolean false coerces to 'False' (capital F), not 'false' at src/apm_cli/adapters/client/copilot.py:801
    Python str(False) == 'False'. A user writing DEBUG: false in apm.yml gets env var value 'False', but every shell tool and 12-factor app expects lowercase 'false'/'true'. Silent semantic breakage worse than the original int bug.
    Suggested: Use str(value).lower() if isinstance(value, bool) else str(value). Add a test.

  • [recommended] None values are silently dropped with no user feedback at src/apm_cli/adapters/client/copilot.py:800
    A key present in apm.yml but set to null is skipped without any warning. The user sees the server installed successfully but the env key is absent from ~/.claude.json.
    Suggested: Emit a warning: 'env var PORT has null value in apm.yml and will be omitted from the MCP config'.

  • [recommended] CHANGELOG [Unreleased] ### Fixed section has no entry for this fix at CHANGELOG.md
    Users upgrading who had broken apm.yml with integer env values cannot discover this fix. Every user-visible bug fix must have a changelog entry.
    Suggested: Add bullet: stdio MCP servers with non-string env values (e.g. PORT: 3000) now write correctly as strings to ~/.claude.json / .mcp.json. (#1222)

  • [nit] Test only covers int coercion; bool and float cases untested at tests/unit/test_claude_mcp.py:129
    Suggested: Extend test with DEBUG: False, RATE: 0.5, TIMEOUT: 0 assertions.

Supply Chain Security Expert

  • [recommended] str(value) on a YAML list or nested dict produces Python repr, not a usable env value at src/apm_cli/adapters/client/copilot.py:801
    If a user writes env: {FOO: [a, b]}, YAML parses it as a Python list. str(['a','b']) yields "['a', 'b']" (Python repr) baked into ~/.claude.json. Not an injection risk but silently incorrect.
    Suggested: Use isinstance(value, (int, float, bool)) whitelist; warn-and-skip list/dict types.

  • [nit] bool coercion: str(True)=='True', str(False)=='False' (capital) diverges from YAML/JSON convention at src/apm_cli/adapters/client/copilot.py:801
    Suggested: Use str(value).lower() when isinstance(value, bool).

  • [nit] Non-string literal values bypass _resolve_env_variable -- add inline comment to explain this is intentional at src/apm_cli/adapters/client/copilot.py:800
    Suggested: Add comment: # Non-string literals cannot contain ${VAR} placeholders; coerce directly.

OSS Growth Hacker

  • [recommended] No CHANGELOG entry exists for the fix
    Numeric env values (PORT, DEBUG=1, TIMEOUT=30) are the most natural YAML you write; this bug silently broke every Claude stdio MCP config with non-string env values. A high-visibility user-facing fix MUST appear in [Unreleased] ### Fixed.
    Suggested: Add: stdio MCP servers declared with plain-dict env: blocks (e.g. PORT: 3000) now work correctly with Claude: non-string values are coerced to strings. (#1222) -- by @koichirok``

  • [nit] CHANGELOG entry should lead with user symptom not code jargon
    Suggested: Lead with: If your apm.yml sets numeric env values (e.g. PORT: 3000), Claude now receives them as strings instead of silently dropping them.

Doc Writer

  • [recommended] CHANGELOG [Unreleased] ### Fixed section is missing an entry for this PR's bug fix at CHANGELOG.md
    The fix is a user-visible bug fix. Per project convention every merged fix gets a [Fixed] CHANGELOG bullet. The PR's rebase activity removed unrelated entries but never added one for its own change.
    Suggested: Add to [Unreleased] ### Fixed: apm install --target claude now preserves env entries on stdio MCP servers and normalizes all env values to strings, fixing silent env loss for Claude stdio MCP server configs. (#1222)

  • [nit] dependencies.md says Claude Desktop but adapter targets Claude Code (.claude/) -- label may need a follow-up clarification at docs/src/content/docs/guides/dependencies.md
    Suggested: Verify Claude Desktop vs Claude Code labeling; update to Claude Code if appropriate.

Auth Expert -- inactive

The PR only touches copilot.py env-var normalization (non-string MCP values), test_claude_mcp.py, and CHANGELOG.md -- none of which involve token management, credential resolution, AuthResolver, or remote-host auth semantics.

Test Coverage Expert

  • [recommended] No test for bool env value coercion: False -> "False" is non-obvious and untested at tests/unit/test_claude_mcp.py
    For a bool False, str() produces "False" (capital F). Many downstream tools expect lowercase 'false'. No test covers the bool case. Grepped tests/ -- no hit for bool dict env in the new code path.
    Suggested: Add parametrized case: {"VERBOSE": False, "RETRIES": True} -> {"VERBOSE": "False", "RETRIES": "True"} (pins current contract; catches any future normalisation change).
    Proof (missing): tests/unit/test_claude_mcp.py::test_configure_self_defined_stdio_coerces_bool_env_to_string -- proves: When a server declares a boolean env value, apm install writes the Python str() form to .mcp.json [devx]

  • [nit] None-valued env key is silently dropped; no test asserts the drop is intentional at tests/unit/test_claude_mcp.py
    The legacy dict branch skips any key whose value is None. Undocumented in the test suite. Grepped all tests -- no test exercises the None-value-in-dict case.
    Suggested: Add test: {"PRESENT": "val", "ABSENT": None} -> expected {"PRESENT": "val"} (ABSENT key absent).
    Proof (missing): tests/unit/test_claude_mcp.py::test_configure_self_defined_stdio_drops_none_env_values -- proves: When env dict contains a None-valued key, apm install omits that key from the written .mcp.json

This panel is advisory. It does not block merge. Re-apply the
panel-review label after addressing feedback to re-run.

Note

🔒 Integrity filter blocked 2 items

The following items were blocked because they don't meet the GitHub integrity level.

  • #1224 pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
  • fix: preserve Claude stdio MCP env #1224 pull_request_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".

To allow these resources, lower min-integrity in your GitHub frontmatter:

tools:
  github:
    min-integrity: approved  # merged | approved | unapproved | none

Generated by PR Review Panel for issue #1224 · ● 3.3M ·

@imk1t
Copy link
Copy Markdown
Contributor Author

imk1t commented May 11, 2026

@danielmeppiel @hansonkim
If you would like, it is possible to address #1266 as well in this PR.

@sergio-sisternes-epam sergio-sisternes-epam added panel-review Trigger the apm-review-panel gh-aw workflow status/triaged Initial agentic triage complete; pending maintainer ratification (silence = approval). and removed panel-review Trigger the apm-review-panel gh-aw workflow labels May 13, 2026
imk1t and others added 2 commits May 14, 2026 08:48
@danielmeppiel danielmeppiel merged commit 894d73a into microsoft:main May 14, 2026
8 checks passed
danielmeppiel pushed a commit to edenfunf/apm that referenced this pull request May 15, 2026
…MCP (closes microsoft#1266)

The Codex and Gemini adapters wrote the env block of self-defined stdio
MCP servers to disk verbatim, bypassing the env-var resolution pipeline.
All three placeholder forms (${VAR}, ${env:VAR}, legacy <VAR>) landed as
literal strings in ~/.codex/config.toml and .gemini/settings.json, so
the MCP child processes received e.g. ATLASSIAN_API_TOKEN=${ATLASSIAN_API_TOKEN}
and authentication failed.

Root cause was twofold. The Codex and Gemini _raw_stdio branches never
called _resolve_environment_variables. Even adapters that did call it
(Cursor) hit a latent gap: the legacy-mode branch only handled the
registry list-of-dict shape, so the dict-shape input from self-defined
stdio deps was iterated as keys, every key failed isinstance(..., dict),
and the env block came out as {}. Codex additionally extends
MCPClientAdapter directly, not CopilotClientAdapter, so the resolver
method wasn't even available on the class.

Fix moves _resolve_environment_variables, _resolve_env_variable,
_resolve_variable_placeholders, the env-placeholder regex helpers, and
the per-server placeholder state to MCPClientAdapter so every adapter
shares one resolver contract. Adds a legacy-mode dict-shape branch
(matches the fix in microsoft#1224 for Claude) and rewrites Codex/Gemini's
_raw_stdio branches to route env and args through the resolver.

Cursor's silent-drop latent bug dies as a byproduct of consolidating
the resolver onto the base; pinned by a new regression test.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

status/triaged Initial agentic triage complete; pending maintainer ratification (silence = approval).

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Claude .mcp.json drops env for self-defined stdio MCP servers

4 participants